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Abstract 



A generalization of the free energy perturbation identity is derived, and a 
computational strategy based on this result is presented. A simple example 
illustrates the efficiency gains that can be achieved with this method. 

The development of efficient methods for the numerical estimation of free energy differ- 
ences remains an outstanding problem in the computational sciences with applications 
as diverse as rational drug design || , ab initio prediction of material properties || , and the 
study of condensates in non-perturbative QCD ||. Many schemes for estimating free energy 
differences trace their origins to the 'perturbation identity ||, 

(e~ AE/kT ) A = e~ AF ' kT . (1) 

Here, AF = Fb — Fa is the Helmholtz free energy difference between two equilibrium states 
of a system, A and B, defined at a common temperature T but different settings of external 
parameters. The variable x (and later y) denotes a microstate of the system, e.g. a point 
in configuration space or phase space; Ea{x) and Eb(x) denote the internal energy as a 
function of microstate, for the two parameter settings; and 

A£(x) = £ B (x) - £ A (x) (2) 

is the energy difference associated with changing the external parameters from one setting 
to the other, while holding fixed the microstate. Finally, (• • -)a denotes an average over 
microstates sampled from the canonical distribution representing state A. 

The traditional perturbation approach to estimating AF is a direct application of Eq.|l]: 
the quantity exp(— AE/kT) is averaged over microstates sampled from ensemble A. |3j 
However, this method converges poorly when there is little overlap in configuration space 
between ensembles A and B. Intuitively, this makes sense: if there is little overlap, then 
we very slowly accumulate information about state B by generating microstates typical of 
state A. 

The aim of this paper is to present a generalization of Eq.|l], as well as a computational 
method, targeted free energy perturbation, based on this result. The practitioner of this 
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method must attempt to construct an invertible transformation Ai, under which ensemble 
A gets mapped onto an ensemble A' which overlaps significantly with B (see Eqs.13, 14). The 



more successful this attempt, the more rapidly the method converges. Indeed, if A' and B 
overlap perfectly, then convergence is immediate! This strategy thus provides a mechanism 
for taking advantage of prior knowledge about states A and B (used to construct the mapping 
Ai) in order to speed up the estimation of AF. 

While the central result and method derived below are new, the use of invertible map- 
pings to enhance the efficiency of free energy calculations has precedents. For simple dis- 
placements, x — > x + d, the method proposed herein is closely related to one developed years 
ago by Voter [j7|], for energy functions E A and Eb which resemble one another apart from 
a spatial translation. Bruce et al |§ have proposed the use of invertible transformations 
as collective Monte Carlo moves - "lattice switches" - to enable the sampling of disparate 
regions of configuration space. Finally, our method is similar in spirit to the metric scaling 
scheme developed by Miller and Reinhardt ||, whereby one attempts to "guide" the sys- 
tem in question through a continuous sequence of equilibrium states, by dynamically, and 
linearly, distorting the space in which the constituent particles evolve. 

We now derive our central result, Eq.|10| below. 

Consider an invertible transformation of configuration space onto itself: 

A4:x-y(x). (3) 

This might be a displacement as in Ref. 0, or perhaps a scaling transformation as in Ref. 
||, or it might be a considerably more complicated, nonlinear mapping. Now imagine 
microstates xi,X2, ■ ■ • sampled from some (so far, arbitrary) primary ensemble, represented 
by a distribution p(x); and construct their images under the transformation: yi,y2,---, 
where y n = Ai(x. n ). The y's are, effectively, sampled from a secondary ensemble, which is 
the image of the primary ensemble under Ai. This secondary ensemble is represented by a 
distribution p(-), related to the primary distribution p(-) by: 

rj(y) = p(x)/J(x), (4) 

where J(x) = |<9y/<9x| is the Jacobian of the mapping At. Here and henceforth, when the 
variables x and y appear together, it will be understood that they are related by y = A^(x). 
Next, define a function 

$(x) = E B (y) - E A (x) - kT In J(x), (5) 

let the primary ensemble be the canonical distribution corresponding to state A, 

p(x) = ± c -*AM/*r ( 6 ) 

A A 

and evaluate the average of exp(— <&/kT) over points x sampled from this ensemble: 

-*/w^ = y dxp ( x ) e -*W/*T (7) 

= -L [ dxJ(x)e- £fl ( y )/ feT (8) 
Z A J 

— f d ye- E ^/ kT = (9) 

Za 
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where Za and Zb are partition functions. (Note the change in the variable of integration: 
/ J dx • • • = / dy ■ ■ •.) Invoking the relation F = —kT In Z, we finally obtain 



,-*/fcX\ _ e -AF/kT 
I A 



(10) 



Let us now turn our attention to the application of this result to the problem of estimating 
AF. 

Eq.|ID| generalizes of the free energy perturbation identity, reducing to the latter in the 
special case Ai : x — > x. However, Eq.[l0] is valid for arbitrary invertible transformations 
Ai. It is plausible that one can take advantage of this generality to enhance the efficiency of 
computing AF. That is, there may exist mappings Ai for which the average of exp(— <&/kT) 
converges more rapidly than the average of exp(—AE/kT). 

To investigate this possibility, consider p((p\A4), the distribution of values of <fi = $(x), 
for x sampled from A (Eq.|j). Eq.[l0] asserts that 

J d0p(0|M)e-*/ fcT = e~ Ap / kT , (11) 

for any choice of Ai. In practice, we estimate AF by averaging exp(— <$>/kT) over a finite 
number of sampled microstates xi, x 2 , • • • , x^: 

1 N 

J_ ST e -^ kT « e - AF / fcT , (12) 

N n=l 

where <p n = $(x n ). This approximation becomes an equality as iV — » oo, but the rate 
of convergence depends strongly on the choice of Ai] roughly speaking, the narrower the 
distribution p(<f)\Ai), the faster the convergence. Therefore we are faced with the practical 
question, how do we choose Ai so as to maximize the rate of convergence of the left side of 
Eq.|12|? 

Recall that poor convergence of the usual perturbation method is a symptom of too 
little overlap between A and B in configuration space: we then learn little about B when 
sampling from A. Now note that, when generating the sequence of (f> n 's, we are effectively 
harvesting information from two ensembles; the points x sample the primary ensemble A, 
while the points y sample the secondary ensemble, A', which is the image of A under the 
transformation Ai: 

A ^ A'. (13) 

Intuition suggests that, since the primary ensemble represents state A (by construction), we 
ought to attempt to maximize the overlap between the secondary ensemble and state B, 

A' w B, (14) 

so as to speedily gain information about both of the equilibrium states (A and B) which 
interest us. 

Pursuing this line of intuition, let us first consider the extreme case, and define a perfect 
transformation Ai* to be one under which A maps exactly onto B: 
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P(x) = ^e-^w/^ v(y) = ±e- E ^y kT . (15) 

A ^B 

By Eq.f|, this implies 

F B - E B (y) = F A - E A (x) - kT In J(x), (16) 

in other words $(x) = AF for all x. Hence, we have a maximally narrow distribution of 
values of (j>: 

p(4>\M*) = 5(4> — AF). (17) 

Thus, the convergence of Eq.\T^ is immediate if the transformation is perfect: $(x) = AF 
for every sampled x. 

Unfortunately, constructing a perfect transformation is likely to be much more difficult 
than the original problem of computing AF. However, it stands to reason that if p(<p\A4) 
is a delta-function when A' = B, then it will remain narrow when A' rs B. Eq.[H] thus 
gives credence to our earlier intuition (Eq.[l4]): we ought indeed to look for a transformation 
under which A' enjoys good overlap with B. A close resemblance between A' and B implies 
a narrow distribution of 0's, which in turn implies rapid convergence of our estimate of AF. 

Let us summarize what has been stated to this point. Eq{TI] suggests a method of estimat- 
ing AF: microstates x n are sampled from the canonical ensemble A; the value cj) n = $(x n ) 
is computed for each sampled microstate; and the estimator X N = (1/N) J2 n exp(— <j> n /kT) 
converges to exp(— AF/kT) as N —>■ oo. Two ingredients of this scheme are: an invert- 
ible mapping A4, and the image A' of the canonical ensemble A under A4. If A' coincides 
with B, then the method converges immediately. Hence, if we choose a transformation M. 
which significantly improves the overlap with B, without necessarily being "perfect", then 
Xjy ought to converge more rapidly with N than the traditional free energy perturbation 
estimator, X™ p = (1/N) J2 n exp(— AE n /kT). We will refer to this method as targeted free 
energy perturbation, since its successful implementation requires finding a transformation M. 
for which the secondary ensemble A' comes reasonably close to "hitting" the target ensemble, 
B. 

Several potential extensions of targeted free energy perturbation, to be developed more 
fully elsewhere, suggest themselves. First, for a parameter-dependent energy function 
F(x, A), the application of Eq.[l(], to states A and B defined by infinitesimally different 
values of A, leads to the identity 

f = (f + u .v*-*rv.u), («) 



where u(x) is an arbitrary different iable vector field of bounded magnitude |L0| While this 
result reduces to the widely used thermodynamic integration identity |TT| for u = 0, other 
choices of u might accelerate convergence of the average. It is also straightforward to in- 



corporate Eq.|lO| into the umbrella sampling [12] and overlapping distributions JT3[ methods. 



Finally, a nonlinear metric scaling scheme - with particular potential for enhancing the ef- 
ficiency of free energy calculations based on steered molecular dynamics |L4[] - might result 
from combining the approach of the present paper with that of Ref . [0] . 
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The utility of targeted free energy perturbation depends critically on our ability to con- 
struct a mapping Ai appropriate to the problem at hand. While intuition will in some 
cases reliably suggest a candidate, in others it may be very difficult or computationally 
expensive to devise a mapping which improves the overlap with ensemble B. In the case 



of non- diffusive systems, however, a promising and quite general strategy exists. ]15| For 
a quasi-rigid system such as a large molecule, the canonical ensemble occupies a strongly 
localized region of configuration space (assuming that the translational and rotational de- 
grees of freedom of the entire molecule have been integrated out, or else pinned down by a 
constraining potential). Given two such molecules, A and B - alchemically different, hence 



represented by different energy functions ]TJ| - we can roughly approximate the associated 



canonical ensembles by Gaussian distributions in the many-dimensional configuration space. 
fTTfl A reasonable candidate for Ai is then the linear transformation which converts one of 
these Gaussians into the other. Even if the Gaussian approximation is quite crude, the 
mapping thus constructed is likely to result in a significantly improved overlap between A' 
and B (relative to that between A and B). 

We end this paper with numerical results illustrating the targeted free energy pertur- 
bation method. The setting is the expansion of a cavity in a fluid. While the aim here is 
simply a comparison between methods, it bears mention that recent years have seen renewed 
theoretical interest in the problem of cavity formation in fluids [0 , both as a fundamental 
problem in physical chemistry, and because of the role played by hydrophobicity in deter- 
mining and stabilizing protein structure. 

Consider n p point molecules confined within a cubic container of volume L 3 , but excluded 
from a spherical cavity of radius R located at the center (r = 0) of the container. Assume 
periodic boundary conditions and a pairwise interaction V^ n t( r i) r j) between molecules. We 
can write the energy function for such a fluid as: 

£(x; R) = 0(x; R) + £ V iQt (r u Tj ). (19) 

Here, x = (ri, • • • , r„ p ) specifies the microstate, and 0(x;i?) is either (if all r*j > R) 
or +oo (otherwise), thus enforcing the exclusion of molecules from the spherical cavity. 
Treating the cavity radius R as an external parameter, let us choose two values, Ra and 
Rb, satisfying < Ra < Rb < L/2, and let A and B denote the corresponding equilibrium 
states (canonical ensembles), at a given temperature T. We want to compute the associated 
free energy difference, AF = Fb — Fa- Physically, this is the reversible, isothermal work 
required to expand the cavity radius from Ra to Rb- 

The quantity exp(— AF/kT) is equal to the probability P that, given a microstate x 
sampled from ensemble A, the region Ra < r < Rb will be devoid of molecules. This can be 
viewed as a consequence of Eq.|l[ noting that AE is equal to either or +oo, depending on 
whether or not this region is vacant. The application of the traditional perturbation method 
amounts to evaluating this probability by straight sampling. 

Let us now try to construct a transformation M. which improves the efficiency of es- 
timating P = exp(— AF/kT). With the traditional method, poor convergence arises if 
P < 1, i.e. if for nearly every microstate x G A, there will be molecules located in the 
region Ra < r < Rb- Therefore let us choose M. so as to vacate this region. A candidate 
transformation [B5|, acting on each particle independently, is: 



5 



g{n)Yi , i = !,-••, n p , (20) 



where 

9(r) 



, (R% ll\)(l 3 -^ 



(L 3 - 8R 3 A )r 
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(21) 



if Ra < r < L/2, and g(r) = 1 otherwise. Under this transformation, the region of space 
defined by Ra < r < L/2 gets uniformly compressed into the region Rb < r < L/2. For any 
x sampled from A, the quantity Es{y) — Ea(x) is just the change in the total interaction 
energy (J2 Vint) resulting from this compression, and 

J(x) = [(L 3 - 8i?|)/(L 3 - 8i$)]"« (22) 

where z/(x) is the number of molecules in the region Ra < r < L/2. 

We simulated 125 molecules inside a container of sides L = 22.28 A, at T = 300 K. A 
Lennard- Jones interaction between molecules was used, with parameters corresponding to 
argon |20| (a = 3.542A, e = 0.1854kcal/mol). The values of Ra and Rb were taken to be 
9.209 A and 9.386 A, respectively. 

Sampling from ensemble A was achieved with the Metropolis algorithm. Three inde- 
pendent runs were carried out, each consisting of 500 initial relaxation sweeps followed by 
2 x 10 5 production sweeps. These runs were used to estimate P = exp(— AF/ kT), using both 
the traditional perturbation approach (i.e. observing the frequency with which the region 
Ra < r < Rb is spontaneously vacant) and targeted perturbation (Eq.^2j). In Fig.|I|, each 
red curve shows the traditional perturbation estimate of P for a single run, accumulating as 
a function of number of production sweeps, iV (plotted in increments of AiV = 1000). The 
blue curves show the targeted perturbation estimates for the same runs. It is evident that 
the latter converge much faster than the former. Combining the data from all three runs, the 
two methods yield the estimates P^ d = (4.83 ±0.49) x 10" 4 and P t e ^ g = (5.81 ±0.05) x 10" 4 . 
The error bars are la, and their ratio suggests that efficiency in this case is improved by 
about two orders of magnitude by using targeted (rather than traditional) free energy per- 
turbation. 

It is a pleasure to acknowledge stimulating discussions and correspondence with Graeme 
Ackland, Alastair Bruce, Angel Garcia, Gabriel Istrate, Lawrence Pratt, and Nigel Wilding. 
This research is supported by the Department of Energy, under contract W-7405-ENG-36. 
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FIGURES 

FIG. 1. Traditional and targeted free energy perturbation estimates of P = exp(— AF/kT), 
as a function of number of MC sweeps. 
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